Abstract

Consider a system of coalescing random walks where each individual performs a random
walk over a finite graph G, or (more generally) evolves according to some reversible
Markov chain generator Q. Let C be the first time at which all walkers have coalesced
into a single cluster. C is closely related to the consensus time of the voter model for
this G or Q.

We prove that the expected value of C is at most a constant multiple of the largest 
hitting time of an element in the state space. This solves a problem posed by Aldous and 
Fill and gives sharp bounds in many examples, including all vertex-transitive graphs. 
We also obtain results on the expected time until only $k \geq 2$ clusters remain. Our proof
tools include a new exponential inequality for the meeting time of a reversible Markov 
chain and a deterministic trajectory, which we believe to be of independent interest. 
1 Introduction

Consider a system of continuous-time random walks on a finite connected graph G, with a 
walker starting from each vertex of G. Let the walkers evolve independently, except that 
any two that occupy the same vertex of G at a given time coalesce into one (this is made 
precise in Section 3.2).

As time goes by, larger and larger coalesced clusters emerge, until at a certain random
time C only one cluster remains. The question we address here is: how large can C be in
terms of other parameters of G? This is a natural question which has implications for the
so-called voter model on G, discussed in Section 1.1 below.

It is instructive to consider what happens in the simple case of $G = K_n$, the complete
graph on n vertices. An explicit calculation [2, Chapter 14, Sec. 3.3] shows that:

\[ \frac{C}{n} \to_w \sum_{j=1}^{+\infty} \frac{Z_j}{j(j+1)}, \text{ with } \{Z_j\}_{j \geq 1} \text{ i.i.d. exponential random variables with mean 1.} \tag{1} \]

In particular, $\mathbb{E}[C] \sim n$ as $n \to +\infty$. What is remarkable about this is that any two
of the walkers will take an expected time ~ n/2 to meet and coalesce; the fact that we 
are dealing with an unbounded number of particles only increases the expected time by a 
constant factor. 
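
As a sanity check on the complete-graph discussion, the following sketch computes E[C] exactly. It assumes (our normalization, chosen to match a pairwise meeting time of about n/2) that each walker jumps at rate 1 to a uniformly chosen other vertex, so that j clusters merge at total rate j(j-1)/(n-1). The function names are illustrative only.

```python
from fractions import Fraction


def expected_coalescence_time_complete_graph(n: int) -> Fraction:
    """E[C] on K_n, assuming rate-1 walkers that jump to a uniform other vertex."""
    # With j clusters present, each of the j*(j-1)/2 pairs meets at rate 2/(n-1),
    # so the waiting time for the next merger has mean (n-1)/(j*(j-1)).
    return sum(Fraction(n - 1, j * (j - 1)) for j in range(2, n + 1))


def expected_pair_meeting_time(n: int) -> Fraction:
    """Expected meeting time of two walkers under the same assumption."""
    return Fraction(n - 1, 2)


for n in (10, 100, 1000):
    ratio = expected_coalescence_time_complete_graph(n) / expected_pair_meeting_time(n)
    # The ratio stays below 2 however many walkers there are.
    print(n, float(ratio))
```

Under this assumption the sum telescopes to (n-1)(1-1/n), which is why the ratio to the pairwise meeting time never exceeds 2.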

It is natural to ask what happens in more general graphs. This is closely related to the 
following problem, which was posed by Aldous and Fill in the mid-nineties. 

Problem 1 (Open problem 13, Chapter 14 of [2]) Prove that there exists a universal
constant $K > 0$ such that the expected value of C satisfies

\[ \mathbb{E}[C] \leq K\, T^G_{\mathrm{hit}} \]

irrespective of initial conditions, where $T^G_{\mathrm{hit}}$ is the maximum expected hitting time of a vertex
in G.

To see how this relates to our previous discussion, consider a vertex-transitive graph
G. Proposition 5 in [2, Chapter 14] implies that the maximum expected meeting time of
two walkers on G, denoted by $T^G_{\mathrm{meet}}$, actually equals $T^G_{\mathrm{hit}}/2$. This implies that, if Problem
1 has a positive solution, all vertex-transitive graphs are like $K_n$ in that $\mathbb{E}[C]$ is at most a
universal constant factor away from $T^G_{\mathrm{meet}}$. A similar conclusion holds for the many other
families of graphs where $T^G_{\mathrm{meet}} = \Theta(T^G_{\mathrm{hit}})$ (e.g. all regular graphs with $T^G_{\mathrm{hit}} = O(n)$). For
more general graphs it is still true that $T^G_{\mathrm{meet}} \leq T^G_{\mathrm{hit}}$, as proven in the aforementioned
Proposition (see also [1]), and the Problem may be viewed as a strengthening of this fact.¹

¹ There are graphs such as stars where $T^G_{\mathrm{hit}}$ is much larger than $T^G_{\mathrm{meet}}$ or $\mathbb{E}[C]$.

To the best of our knowledge, Problem 1 has remained open up to now. The best known
bound of this sort has an extra ln|V| factor; see Proposition 11 in [2, Chapter 14]. Our 
main goal in this paper is to give a solution of Problem 1 in the more general setting of 
reversible Markov chains. 

Assume that Q is the generator of a reversible, irreducible, continuous-time Markov
chain over a finite set V. Given $v \in V$, let $H_v$ be the hitting time of v, i.e. the first time at
which a trajectory of Q hits v. We define the following parameter of the chain:

\[ T^Q_{\mathrm{hit}} = \max_{v,w \in V} \mathbb{E}_w[H_v] = \text{largest expected hitting time for } Q. \tag{2} \]
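
To make definition (2) concrete, here is a minimal sketch that computes the largest expected hitting time of a small chain by solving the usual first-step linear equations exactly. The encoding of Q as a matrix of off-diagonal rates `rates[x][y]`, and the helper names, are our own assumptions for illustration.

```python
from fractions import Fraction


def hitting_times_to(rates, target):
    """Return h with h[w] = E_w[H_target], solving
    sum_y q(w, y) * (h(w) - h(y)) = 1 for w != target, with h(target) = 0."""
    states = [s for s in range(len(rates)) if s != target]
    idx = {s: i for i, s in enumerate(states)}
    m = len(states)
    aug = [[Fraction(0)] * (m + 1) for _ in range(m)]  # augmented matrix [A | b]
    for w in states:
        row = aug[idx[w]]
        for y, q in enumerate(rates[w]):
            if y == w or q == 0:
                continue
            row[idx[w]] += q
            if y != target:
                row[idx[y]] -= q
        row[m] = Fraction(1)
    # Gauss-Jordan elimination in exact arithmetic (the system is nonsingular
    # when the chain is irreducible).
    for col in range(m):
        pivot = next(r for r in range(col, m) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [x / pv for x in aug[col]]
        for r in range(m):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    h = [Fraction(0)] * len(rates)
    for s in states:
        h[s] = aug[idx[s]][m]
    return h


def max_hitting_time(rates):
    """Largest expected hitting time over all start/target pairs."""
    return max(max(hitting_times_to(rates, v)) for v in range(len(rates)))


def cycle_rates(n):
    """Rate-1 simple random walk on the n-cycle (rate 1/2 to each neighbor)."""
    rates = [[Fraction(0)] * n for _ in range(n)]
    for x in range(n):
        rates[x][(x + 1) % n] += Fraction(1, 2)
        rates[x][(x - 1) % n] += Fraction(1, 2)
    return rates


print(max_hitting_time(cycle_rates(6)))
```

For the cycle the expected hitting time between vertices at distance d is d(n-d), which gives a quick way to check the solver.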

Define a system of coalescing random walks as in the case of graphs, with the difference 
that each walker now evolves over V according to Q. The following Theorem solves Problem 
1. 

Theorem 1.1 There exists a universal constant $K > 0$ such that, with Q as above, for any
$n \in \mathbb{N}\setminus\{0\}$ and for any $x^{(n)} = (x(1), \dots, x(n)) \in V^n$:

\[ \mathbb{E}_{x^{(n)}}[C] \leq K\, T^Q_{\mathrm{hit}}. \]

Remark 1 Here $x^{(n)}$ is an initial condition, with n arbitrary. In particular, there may be
more or less than one walker at each site $v \in V$ in the beginning of the process. Allowing
for arbitrary initial conditions is convenient for our proofs, but does not really change the
results.

We also prove a stronger result. Let $C_k$ denote the first time at which there are at most k
clusters of coalesced walkers ($k \geq 1$). Notice that $C_1 = C$ with this definition.

Theorem 1.2 There exists a universal constant $K_1 > 0$ such that, in the same setting of
Theorem 1.1,

\[ \forall k \in \mathbb{N}\setminus\{0\},\quad \mathbb{E}_{x^{(n)}}[C_k] \leq K_1 \left( \frac{T^Q_{\mathrm{hit}}}{k} + T^Q_{\mathrm{mix}} \right), \]

where $T^Q_{\mathrm{mix}}$ is the mixing time of Q (see Section 2.2 for a definition).

The dependence on k in this Theorem is essentially best possible, as $\mathbb{E}_{x^{(n)}}[C_k] \sim T^Q_{\mathrm{hit}}/k$
for large complete graphs. The case k = 1 gives back Theorem 1.1, as $T^Q_{\mathrm{mix}} \leq c\,T^Q_{\mathrm{hit}}$ for
some universal $c > 0$ [2, Chapter 4]. We will nevertheless prove Theorem 1.1 first and then
show how its proof can be modified to obtain Theorem 1.2.

One justification for proving this second result is that it is helpful in approximating the 
distribution of C. We are in the process of writing a paper where we show that, if Q is 
transitive and $T^Q_{\mathrm{mix}} \ll T^Q_{\mathrm{hit}}$, then



In particular, $\mathbb{E}[C] \sim T^Q_{\mathrm{hit}}$. This was previously known only for discrete tori $\mathbb{Z}_L^d$ with $L \gg 1$
in $d \geq 2$ dimensions, due to Cox's paper [4]. An important step in both our proof and
Cox's argument is that $\mathbb{E}[C_k] \ll T^Q_{\mathrm{hit}}$ if $k \gg 1$. Cox proves this in [4, Section 4] via a
simple renormalization argument which is very specific for discrete tori, whereas we use
Theorem 1.2 for the same purpose.

1.1 Application to the voter model 

We now sketch the connection between our results and the voter model [6], [2] on a graph G
(this could be generalized to an arbitrary generator Q, but we will not do this here). The
state of the process at a given time t is a function:

\[ \eta_t : V(G) \to O \]

where V(G) is the vertex set of G and O is a fixed set of possible opinions. The evolution
of the process is as follows. Each vertex $v \in V(G)$ "wakes up" at rate 1; when that happens
at a time $t > 0$, v chooses one of its neighbors w uniformly at random and updates its value
of $\eta_t(v)$ to w's opinion $\eta_{t-}(w)$; all other opinions stay the same.

A classical duality result (see e.g. [6, Chapter 5] or [2, Chapter 14]) relates the state of
the process at a given time to a system of coalescing random walks on G moving backwards
in time. In particular, the consensus time for the voter model (i.e. the least time at which
all vertices of G have the same opinion) is dominated by the coalescence time C from the
initial state with all vertices occupied. This implies the following Corollary of Theorem 1.1.

² Transitivity can be dropped at the cost of making stronger assumptions on Q and using a different
normalization factor.

Corollary 1.1 There exists a universal constant $K > 0$ such that, for any graph G and
any set O, the expected value of the consensus time of the voter model defined in terms of
G and O, started from an arbitrary initial state, is bounded by $K\,T^G_{\mathrm{hit}}$.

Proposition 5 in [2, Chapter 14] shows that the Corollary is tight up to the value of K
for vertex-transitive G, at least when the initial conditions are i.i.d. uniform over $\{-1, +1\}$
(say); we omit the details.

1.2 Main proof ideas 

Let us give an outline of the (elementary) proof of Theorem 1.1; the proof of Theorem 1.2
is quite similar. For clarity, we first present an oversimplified account, and then explain 
how one can avoid the oversimplifications. 

We label the n walkers $(X_t(a))_{t \geq 0}$ with numbers $a = 1, \dots, n$. Instead of having walkers
coalesce, we will assume that a walker #b will kill any walker #a with a > b that happens
to be in the same state as itself (this is made precise in Section 3.3). The number of walkers
that are alive at time t in this process is precisely the number of clusters in the coalescing
random walks process, and C is the first time at which only walker #1 is still alive. This
implies that:

\[ \mathbb{P}(C > t) \leq \sum_{a=2}^{n} \mathbb{P}(\text{walker \#}a \text{ alive at time } t). \]

We now make the following oversimplification: 

Oversimplification #1: walker #a dies at the first time when $X_t(a) = X_t(b)$
for some b < a.

The reason why this is an oversimplification is that a walker #b may have died before
meeting walker #a. For the moment, we ignore this and write:

\[ \mathbb{P}(\text{walker \#}a \text{ alive at time } t) \leq \mathbb{P}\left( \bigcap_{b=1}^{a-1} \{\forall 0 \leq s \leq t,\ X_s(a) \neq X_s(b)\} \right). \]

In order to simplify the RHS, we notice that the trajectories $(X_t(u))_{t \geq 0}$ of walkers
$1 \leq u \leq a$ are independent realizations of Q. Conditioning on $X_s(a) = h_s$, $s \geq 0$, makes
the events in the RHS independent, and we deduce:

\[ \mathbb{P}\left(\text{walker \#}a \text{ alive at time } t \mid (X_s(a))_{s \geq 0} = (h_s)_{s \geq 0}\right) \leq \prod_{b=1}^{a-1} \mathbb{P}\left(\forall 0 \leq s \leq t,\ X_s(b) \neq h_s\right). \]

We now make another oversimplification. 

Oversimplification #2: $(X_t(b))_{t \geq 0}$ is started from the stationary distribution
for all b.

This allows us to use the following Lemma, which we believe to be new (and of independent
interest).

Lemma 1.1 (Meeting time Lemma; proven in Section 5.2) Let $(X_t)_{t \geq 0}$ be a realization
of Q starting from the stationary distribution $\pi$. Then there exist $v \in V$ and
a quasistationary distribution $q_v$ for $V\setminus\{v\}$ such that for any deterministic path
$h \in D([0,+\infty), V)$, we have:

\[ \forall t \geq 0,\quad \mathbb{P}_\pi\left(\forall 0 \leq s \leq t,\ X_s \neq h_s\right) \leq \mathbb{P}_{q_v}(H_v > t) = \exp\left(-\frac{t}{\mathbb{E}_{q_v}[H_v]}\right). \]

Remark 2 The proof of Lemma 1.1 shows that we may take $v \in h([0, +\infty))$. This is a
well-known result if $h \equiv v$ [2, Chapter 3, Section 6.5]. An application of this Lemma to
so-called cat-and-mouse games is sketched in the final section.

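Notice that $\mathbb{E}_{q_v}[H_v] \leq T^Q_{\mathrm{hit}}$, so:

\[ \mathbb{P}\left(\text{walker \#}a \text{ alive at time } t \mid (X_s(a))_{s \geq 0} = (h_s)_{s \geq 0}\right) \leq e^{-\frac{(a-1)t}{T^Q_{\mathrm{hit}}}}. \]

This shows that:

\[ \mathbb{P}(C > t) \leq \sum_{a=2}^{n} e^{-\frac{(a-1)t}{T^Q_{\mathrm{hit}}}}. \]

If one takes $t = (\ln 2 + c)\,T^Q_{\mathrm{hit}}$ with $c \geq 0$, the RHS becomes:

\[ \mathbb{P}\left(C > (\ln 2 + c)\,T^Q_{\mathrm{hit}}\right) \leq \sum_{a=2}^{n} 2^{-a+1} e^{-(a-1)c} \leq e^{-c}, \]

and this gives $\mathbb{E}[C] \leq (\ln 2 + 2)\,T^Q_{\mathrm{hit}}$.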
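
The last inequality in the heuristic computation is a statement about a geometric sum, and it can be checked numerically. The sketch below is only an illustration; the function name is ours.

```python
import math


def oversimplified_tail_bound(n: int, c: float) -> float:
    """Evaluate sum_{a=2}^{n} 2^{-(a-1)} * exp(-(a-1) * c)."""
    return sum(2.0 ** (-(a - 1)) * math.exp(-(a - 1) * c) for a in range(2, n + 1))


for c in (0.0, 0.5, 1.0, 3.0):
    # Each value should not exceed exp(-c), whatever the number of walkers.
    print(c, oversimplified_tail_bound(1000, c), math.exp(-c))
```

The bound is uniform in n because the sum is dominated by a geometric series with ratio exp(-c)/2, which is at most 1/2.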

Of course, this is not a proof of Theorem 1.1 because of the oversimplifications. Our
way out of this is to introduce a process where at any given time there is a list of allowed
killings. At any time t there will be a set $A_t$, so that walker #b may kill walker #a at time
t only if b < a and $(b, a) \in A_t$ (cf. Section 3.4). The salient characteristics of this process
are:

1. For any choice of $A = (A_t)_{t \geq 0}$, the set of alive walkers in the process defined via A
dominates the corresponding set in the process without A (see Proposition 3.2).

2. A judicious choice of A will ensure that for each a, there will be a large enough time 
interval where a large number of walkers will be available to kill walker #a. Moreover, 
many of these will be stationary. 

Item 1 allows us to consider the process with a list of allowed killings instead of the 
original process in order to obtain upper bounds. Item 2 will mean that we may apply 
the Meeting Time Lemma to at least some of the walkers with indices b < a, in some 
time intervals. These two ingredients will allow us to "fix" the oversimplified proof just 
presented. 

1.3 Organization 

The remainder of the paper is organized as follows. Section 2 introduces our notation and
recalls some basic concepts. Section 3 defines the main processes we consider in the paper.
Section 4 presents the proofs of the two Theorems, and Section 5 presents the proof of
Lemma 1.1. Some final comments are presented in the last Section.

2 Preliminaries 

In what follows we recall some basic material while also fixing notation. 
2.1 Basic notation 

$\mathbb{N}$ is the set of non-negative integers. Given $n \in \mathbb{N}\setminus\{0\}$, we set $[n] = \{1, 2, \dots, n\}$.

We will often speak of universal constants. These are numbers that are independent of
any other object or parameter under consideration, be it a Markov chain, the initial state 
of a process under consideration or anything else. 

The cardinality of a finite set S is denoted by $|S|$, and $2^S$ represents the power set of
S (i.e. the set whose elements are the subsets of S). The set of all probability measures
over S will be denoted by $M_1(S)$. $\mathbb{R}^S$ denotes the space of all functions $f : S \to \mathbb{R}$, or
equivalently of all (column) vectors with entries indexed by S. Linear operators acting
on $\mathbb{R}^S$ correspond to matrices with rows and columns indexed by the elements of S. If A
is some matrix of this sort, $\|A\|_{\mathrm{op}}$ is the operator norm of A. If A is symmetric, we let
$\lambda_{\min}(A)$, $\lambda_{\max}(A)$ denote the minimum and maximum eigenvalues of A (respectively).

Given a finite set $F \neq \emptyset$, a function $\omega : [0, +\infty) \to F$ is said to be cadlag if there exist
$t_0 = 0 < t_1 < t_2 < \dots < t_n < \dots \nearrow +\infty$ with $\omega$ constant over each interval $[t_{j-1}, t_j)$.
$D([0, +\infty), F)$ is the set of all such cadlag functions, with the $\sigma$-field generated by the
projections "$\omega \mapsto \omega(t)$" ($t \geq 0$).

2.2 Markov chain basics 

Let V be a finite, non-empty set. A matrix Q (with rows and columns labelled by V) which
acts on $\mathbb{R}^V$ in the following way:

\[ Q : f(\cdot) \in \mathbb{R}^V \mapsto \sum_{x \in V} q(\cdot, x)\,(f(\cdot) - f(x)), \text{ with } q(\cdot,\cdot) \geq 0 \]

defines a unique continuous-time Markov chain on V. More precisely, there exists a
unique family of measures $\{\mathbb{P}_x\}_{x \in V}$ over $D([0, +\infty), V)$ (with the $\sigma$-field generated by
finite-dimensional projections) such that, letting

\[ X_t : \omega \in D([0, +\infty), V) \mapsto X_t(\omega) = \omega(t) \in V \]

and $\mathcal{F}_t = \sigma(X_s : s \leq t)$, we have $\mathbb{P}_x(X_0 = x) = 1$ and

\[ \mathbb{P}_x(X_{t+s} = y \mid \mathcal{F}_s) = \text{the entry of } e^{-tQ} \text{ labelled by } (X_s, y) \quad (t, s \geq 0,\ y \in V). \tag{3} \]

Q is said to be the generator of the Markov chain and the numbers $q(x, y)$ ($x, y \in V$, $x \neq y$)
are the transition rates. We let $\mathbb{E}_x[\cdot]$ denote expectation with respect to $\mathbb{P}_x$.

We also define

\[ \mathbb{P}_\mu = \sum_{x \in V} \mu(x)\,\mathbb{P}_x \quad (\mu \in M_1(V)) \]

which we interpret in the customary way, as describing the law of the chain given by Q
from a random initial state with law $\mu$. $\mathbb{E}_\mu[\cdot]$ is the corresponding expectation symbol.

We will always assume that Q is irreducible, meaning that for all $A \subset V$ with $A, V\setminus A \neq \emptyset$
there exist $a \in A$ and $b \in V\setminus A$ with $q(a, b) \neq 0$. In this case there exists a unique
probability measure $\pi \in M_1(V)$ which is stationary in the sense that $\mathbb{P}_\pi(X_t = \cdot) = \pi(\cdot)$ for
all $t \geq 0$. Moreover, we have that:

\[ \forall x, y \in V,\quad \lim_{t \to +\infty} \mathbb{P}_x(X_t = y) = \pi(y). \]

The mixing time of Q measures the speed of this convergence:

\[ T^Q_{\mathrm{mix}} = \inf\left\{ t \geq 0 : \forall x \in V,\ \max_{S \subset V} |\mathbb{P}_x(X_t \in S) - \pi(S)| \leq 1/4 \right\}. \]

Finally, we will also assume that Q is reversible with respect to $\pi$, which means that
$\pi(x)q(x,y) = \pi(y)q(y,x)$ for all distinct $x, y \in V$. This is the same as requiring that the
matrix $\Pi^{1/2} Q \Pi^{-1/2}$ is symmetric, where $\Pi$ is diagonal and has the values $\pi(v)$, $v \in V$, on
the diagonal.
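
The two equivalent formulations of reversibility can be checked side by side on a small example. The sketch below assumes the sign convention used above, in which the diagonal of Q carries the total jump rate and the off-diagonal entries are minus the transition rates; the three-state birth-death chain and the function names are our own illustrative choices.

```python
import math


def is_reversible(rates, pi, tol=1e-12):
    """Detailed balance: pi(x) q(x, y) == pi(y) q(y, x) for all distinct x, y."""
    n = len(rates)
    return all(
        abs(pi[x] * rates[x][y] - pi[y] * rates[y][x]) <= tol
        for x in range(n)
        for y in range(n)
        if x != y
    )


def symmetrized_generator(rates, pi):
    """Return Pi^{1/2} Q Pi^{-1/2}, with (Qf)(x) = sum_y q(x, y) (f(x) - f(y))."""
    n = len(rates)
    gen = [
        [
            sum(rates[x][z] for z in range(n) if z != x) if x == y else -rates[x][y]
            for y in range(n)
        ]
        for x in range(n)
    ]
    return [
        [math.sqrt(pi[x]) * gen[x][y] / math.sqrt(pi[y]) for y in range(n)]
        for x in range(n)
    ]


# Hypothetical birth-death chain on {0, 1, 2}; its stationary law is (1, 2, 6)/9.
rates = [[0.0, 2.0, 0.0], [1.0, 0.0, 3.0], [0.0, 1.0, 0.0]]
pi = [1 / 9, 2 / 9, 6 / 9]
sym = symmetrized_generator(rates, pi)
print(is_reversible(rates, pi))
print(all(abs(sym[x][y] - sym[y][x]) < 1e-12 for x in range(3) for y in range(3)))
```

A chain that only moves clockwise around a triangle has a uniform stationary law but fails both tests, which is a convenient negative example.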



3 Processes with multiple random walks 

We define here the main processes that we will be concerned with, all of which involve n
random walkers for some integer $n \in \mathbb{N}\setminus\{0, 1\}$. We will assume that Q and $\{\mathbb{P}_x\}_{x \in V}$ are as
defined in Section 2.2.

3.1 Independent random walks 

We first define a process made out of n independent realizations of the Markov chain with
generator Q. More specifically, given

\[ x^{(n)} = (x(1), x(2), \dots, x(n)) \in V^n, \]

we let $\mathbb{P}_{x^{(n)}}$ denote the distribution on $D([0, +\infty), V^n)$ corresponding to n independent
trajectories of Q,

\[ (X^{(n)}_t)_{t \geq 0} = (X_t(a) : a \in [n])_{t \geq 0}, \tag{4} \]

with each $(X_t(a))_{t \geq 0}$ started from x(a). That is, the joint law of $\{(X_t(a))_{t \geq 0}\}_{a \in [n]}$ is the
product measure:

\[ \mathbb{P}_{x(1)} \times \mathbb{P}_{x(2)} \times \dots \times \mathbb{P}_{x(n)}. \]

Notice that our notation $\mathbb{P}_{x^{(n)}}$ does not refer explicitly to the fact that this is a process on
$V^n$, as opposed to the process over V defined in the previous subsection. This distinction
should be clear from context and from the fact that we write all $x^{(n)} \in V^n$ with a "(n)"
superscript. The independent random walks process is also a Markov chain: for $x^{(n)} = (x(1), \dots, x(n))$
and $y^{(n)} = (y(1), \dots, y(n))$ distinct, the transition rate from $x^{(n)}$ to $y^{(n)}$ is:

\[ q^{(n)}(x^{(n)}, y^{(n)}) = \begin{cases} q(x(i), y(i)) & \text{if } x(i) \neq y(i) \text{ and } x(j) = y(j) \text{ for all } j \in [n]\setminus\{i\}; \\ 0 & \text{otherwise.} \end{cases} \tag{5} \]

3.2 Coalescing random walks 

For our purposes, it is convenient to define this process, denoted by

\[ (\mathrm{Co}^{(n)}_t)_{t \geq 0} = (\mathrm{Co}_t(a) : a \in [n])_{t \geq 0}, \]

as a deterministic function of the independent random walks process. The idea is that,
once a walker meets another walker with smaller index, it starts following the trajectory of
the latter. That is, consider a realization of $\mathbb{P}_{x^{(n)}}$ as in (4). First define:

\[ \mathrm{Co}_t(1) = X_t(1) \quad (t \geq 0). \]

Given $a \in [n]\setminus\{1\}$, assume inductively that $(\mathrm{Co}_t(b))_{t \geq 0}$ has been defined for $1 \leq b < a$.
Since Q is irreducible, there a.s. is a first time $\tau_a$ at which $X_t(a) = \mathrm{Co}_t(b)$ for some
$b \in [a-1]$. More precisely, define:

\[ \tau_a = \inf\{t \geq 0 : \exists 1 \leq b < a,\ X_t(a) = \mathrm{Co}_t(b)\} \]

and

\[ B_a = \min\{b \in [a-1] : X_{\tau_a}(a) = X_{\tau_a}(b)\} \]

and then set:

\[ \mathrm{Co}_t(a) = \begin{cases} X_t(a), & 0 \leq t < \tau_a; \\ \mathrm{Co}_t(B_a), & t \geq \tau_a; \end{cases} \quad \text{for each } t \geq 0. \]

One can show that the law of $(\mathrm{Co}^{(n)}_t)_{t \geq 0}$ is invariant under permutations of the x(i). We
also define the set:

\[ S_t = \{v \in V : \exists a \in [n],\ \mathrm{Co}_t(a) = v\} \tag{6} \]

as the set of occupied sites in this process. Our definition of $(S_t)_{t \geq 0}$ coincides with the more
traditional coalescing random walks process defined in e.g. [1]. We also set:

\[ C_k = \inf\{t \geq 0 : |S_t| \leq k\} \quad (k \in \mathbb{N}\setminus\{0\}) \]

and $C = C_1$.

Remark 3 We note that this process makes sense even if $x^{(n)}$ contains repeats, i.e. if there
exist $i \neq j$ with x(i) = x(j).

3.3 Random walks with killings 

Let $\partial \notin V$ be a "coffin state". We define a new process

\[ (Y^{(n)}_t)_{t \geq 0} = (Y_t(a) : a \in [n])_{t \geq 0}. \]

The new idea is that a walker with index a will be killed by a walker of index b < a
occupying the same site. More precisely, we first define:

\[ Y_t(1) = X_t(1) \quad (t \geq 0). \]

Given $a \in [n]\setminus\{1\}$, assume inductively that $(Y_t(b))_{t \geq 0}$ has been defined for $1 \leq b < a$.
Define:

\[ \tau_a = \inf\{t \geq 0 : \exists 1 \leq b < a,\ X_t(a) = Y_t(b)\} \]

and set:

\[ Y_t(a) = \begin{cases} X_t(a), & 0 \leq t < \tau_a; \\ \partial, & t \geq \tau_a; \end{cases} \quad \text{for each } t \geq 0. \]



Although our new definition of $\tau_a$ is different from the previous one, it is easy to show that
the two definitions coincide, and that in fact:

Proposition 3.1 (Proof omitted) Let $S_t$ be as in (6). Then for all $t \geq 0$,

\[ S_t = \{v \in V : \exists a \in [n],\ Y_t(a) = v\} \]

and

\[ |S_t| = |\{a \in [n] : Y_t(a) \neq \partial\}|. \]

Therefore, for all $k \in \mathbb{N}\setminus\{0\}$,

\[ \mathbb{P}_{x^{(n)}}(C_k > t) = \mathbb{P}_{x^{(n)}}(|S_t| \geq k+1) = \mathbb{P}_{x^{(n)}}\left(|\{a \in [n] : Y_t(a) \neq \partial\}| \geq k+1\right). \]
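
The inductive definition of the killing process can be turned into a short routine that reads off the kill times from the jump events of the independent walks. This is only a sketch: the event-list encoding `(time, walker, new_site)` and the 0-based indexing (walker 0 plays the role of walker #1) are our own conventions, and simultaneous jumps, which have probability zero, are not treated specially.

```python
def killing_times(initial_sites, jumps):
    """Kill times tau_a of the killing process, given independent trajectories.

    initial_sites[a] is the starting site of walker a.
    jumps is a time-sorted list of (time, walker, new_site) events.
    Returns a list whose entry a is tau_a, or None if walker a is never killed.
    """
    position = list(initial_sites)
    tau = [None] * len(position)

    def resolve(now):
        # Scan indices in increasing order, so that the fate of every b < a
        # at time `now` is settled before walker a is examined.
        for a in range(1, len(position)):
            if tau[a] is None and any(
                tau[b] is None and position[b] == position[a] for b in range(a)
            ):
                tau[a] = now

    resolve(0.0)  # repeated initial sites already cause killings at time 0
    for time, walker, new_site in jumps:
        position[walker] = new_site  # X_t(a) keeps moving even after a is killed
        resolve(time)
    return tau


def alive_count(tau):
    """Number of walkers still alive, i.e. the number of clusters."""
    return sum(t is None for t in tau)


tau = killing_times([0, 1, 2], [(0.5, 2, 1), (0.7, 1, 0)])
print(tau, alive_count(tau))
```

By Proposition 3.1, the number of walkers with no kill time yet at a given moment is the number of occupied sites of the coalescing process at that moment.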

Remark 4 As in Remark 3, we may allow $x^{(n)}$ where x(i) = x(j) for some pair $i \neq j$.
Notice, however, that $Y^{(n)}_0 \neq x^{(n)}$ in this case.

3.4 Random walks with a list of allowed killings 

Now assume that we have a deterministic cadlag trajectory:

\[ A : t \geq 0 \mapsto A_t \in 2^{[n]^2}. \]

We define yet another process:

\[ ((Y^A_t)^{(n)})_{t \geq 0} = (Y^A_t(a) : a \in [n])_{t \geq 0} \]

where a walker with index a may be killed by a walker with index b only if they occupy the
same site at some time t and $(b, a) \in A_t$. Intuitively, this means that b is allowed to kill a
only at times t with $(b, a) \in A_t$.

For a formal definition, we first set:

\[ Y^A_t(1) = X_t(1) \quad (t \geq 0). \]

Given $a \in [n]\setminus\{1\}$, assume inductively that $(Y^A_t(b))_{t \geq 0}$ has been defined for $1 \leq b < a$.
Define:

\[ \tau^A_a = \inf\{t \geq 0 : \exists 1 \leq b < a,\ (b, a) \in A_t \text{ and } X_t(a) = Y^A_t(b)\} \]

and set:

\[ Y^A_t(a) = \begin{cases} X_t(a), & 0 \leq t < \tau^A_a; \\ \partial, & t \geq \tau^A_a; \end{cases} \quad \text{for each } t \geq 0. \]



The following Proposition shows that the process with a list of allowed killings can be used
to upper bound $\mathbb{E}_{x^{(n)}}[C_k]$.

Proposition 3.2 Define:

\[ S^A_t = \{Y^A_t(a) : a \in [n]\}. \]

For any choice of A as above and of initial state $x^{(n)}$, one can couple $(S_t)_{t \geq 0}$ and $(S^A_t)_{t \geq 0}$
such that (almost surely) $S_t \subset S^A_t$ for all $t \geq 0$. In particular, for all $k \in \mathbb{N}\setminus\{0\}$,

\[ \mathbb{P}_{x^{(n)}}(C_k > t) = \mathbb{P}_{x^{(n)}}(|S_t| \geq k+1) \leq \mathbb{P}_{x^{(n)}}(|S^A_t| \geq k+1). \]

We omit the proof of this rather intuitive Proposition. The key idea here is this: suppose
we do not kill a walker a at a given time $t_0$. The only way this could make $S_t$ "smaller"
is if $X_t(a)$ were to meet a walker $X_t(c)$ with c > a at some later time $t > t_0$. But if this
happens, we may pretend that $X_s(a)$ follows the trajectory of $X_s(c)$ for $s \geq t$; this follows
from the Markov property coupled with the fact that $X_t(a) = X_t(c)$. This shows that in
fact $S_t$ does not become smaller.

Remark 5 Similarly to Remark 4, we note that we may allow $x^{(n)}$ with x(i) = x(j) for
some pair $i \neq j$, but then $(Y^A)^{(n)}_0 \neq x^{(n)}$.



4 Proofs of the main Theorems 

We prove Theorems 1.1 and 1.2 in this Section. For simplicity, we first focus on the proof
of Theorem 1.1 and then show how it can be modified to prove the second Theorem. We
will take the notation and definitions in Sections 2.2 and 3 for granted.

4.1 Preliminaries for the proof of Theorem 1.1

We first note that Theorem 1.1 follows from a seemingly different statement.

Proposition 4.1 Let $c, \gamma > 0$ be given universal constants. Suppose we can show that there
exists some choice of $A = (A_t)_{t \geq 0}$ as in Section 3.4 and of $0 \leq t_0 \leq c\,(T^Q_{\mathrm{mix}} + T^Q_{\mathrm{hit}})$ with

\[ \forall n \in \mathbb{N}\setminus\{0\},\ \forall x^{(n)} \in V^n,\quad \mathbb{P}_{x^{(n)}}(|S^A_{t_0}| \geq 2) \leq 1 - \gamma. \]

Then

\[ \forall n \in \mathbb{N}\setminus\{0\},\ \forall x^{(n)} \in V^n,\quad \mathbb{E}_{x^{(n)}}[C] \leq K\,T^Q_{\mathrm{hit}} \]

where $K > 0$ is universal.

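Proof: Given $s \geq 0$, denote:

\[ E(s) = \{C > s\} = \{|S_s| \geq 2\} = \bigcup_{a \in [n]\setminus\{1\}} \{\tau_a > s\}. \tag{7} \]

Combining the assumption of the Proposition with Proposition 3.2 gives:

\[ \forall n \in \mathbb{N}\setminus\{1\},\ \forall x^{(n)} \in V^n,\quad \mathbb{P}_{x^{(n)}}(E(t_0)) \leq \mathbb{P}_{x^{(n)}}(|S^A_{t_0}| \geq 2) \leq 1 - \gamma. \tag{8} \]

We now consider $E(\ell t_0)$ where $\ell > 1$ is an integer. Let $(\Theta_s)_{s \geq 0}$ denote the time-shift
operators for the independent random walks process and let $(\mathcal{F}^{(n)}_s)_{s \geq 0}$ denote the filtration
generated by this process.

\begin{align*}
\mathbb{P}_{x^{(n)}}(E(\ell t_0)) &\leq \mathbb{P}_{x^{(n)}}\left(E((\ell-1)t_0) \cap \left(\cup_{a=2}^{n} \{\tau_a \circ \Theta_{(\ell-1)t_0} > t_0\}\right)\right) \\
&= \mathbb{P}_{x^{(n)}}\left(E((\ell-1)t_0) \cap \Theta^{-1}_{(\ell-1)t_0}(E(t_0))\right) \\
\left(E((\ell-1)t_0) \in \mathcal{F}^{(n)}_{(\ell-1)t_0}\right)\quad &= \mathbb{E}_{x^{(n)}}\left[\mathbf{1}_{E((\ell-1)t_0)}\, \mathbb{P}_{x^{(n)}}\left(\Theta^{-1}_{(\ell-1)t_0}(E(t_0)) \mid \mathcal{F}^{(n)}_{(\ell-1)t_0}\right)\right] \\
\text{(Markov property)}\quad &= \mathbb{E}_{x^{(n)}}\left[\mathbf{1}_{E((\ell-1)t_0)}\, \mathbb{P}_{X^{(n)}_{(\ell-1)t_0}}(E(t_0))\right] \\
\text{(inequality (8))}\quad &\leq \mathbb{E}_{x^{(n)}}\left[\mathbf{1}_{E((\ell-1)t_0)}\,(1-\gamma)\right] \\
&= \mathbb{P}_{x^{(n)}}(E((\ell-1)t_0))\,(1-\gamma) \\
\text{(induction on } \ell)\quad &\leq (1-\gamma)^{\ell}.
\end{align*}

Recalling the definition of E(s), we deduce that:

\[ \forall n \in \mathbb{N}\setminus\{1\},\ \forall x^{(n)} \in V^n,\quad \mathbb{E}_{x^{(n)}}[C] \leq \sum_{\ell \in \mathbb{N}} (1-\gamma)^{\ell}\, t_0 = \frac{t_0}{\gamma} \leq \frac{c\,(T^Q_{\mathrm{mix}} + T^Q_{\mathrm{hit}})}{\gamma}. \]

The Proposition follows from this because $T^Q_{\mathrm{mix}} \leq c_0\,T^Q_{\mathrm{hit}}$ for some universal $c_0 > 0$ [2,
Chapter 3] and both c and $\gamma$ are universal. □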

4.2 Construction of A 

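Notational convention 1 From now on, we fix some $x^{(n)}$ and write $\mathbb{P}$ instead of $\mathbb{P}_{x^{(n)}}$.

We will now design a specific trajectory $A = (A_t)_{t \geq 0}$ which will allow for a simple analysis
of $S^A_t$. Let $m \in \mathbb{N}$ be the smallest non-negative number with $n \leq \sum_{i=0}^{m} 2^i$. Define sets

\[ A_0 = \{1\}; \]

\[ A_j = \left[\sum_{i=0}^{j} 2^i\right] \setminus \left[\sum_{i=0}^{j-1} 2^i\right] \quad (1 \leq j \leq m-1); \]

\[ \text{and } A_m = [n] \setminus \left[\sum_{i=0}^{m-1} 2^i\right]. \]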
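
To visualize the dyadic blocks just defined, the sketch below lists them for a given n. It assumes the reading in which each block $A_j$ with $j < m$ has exactly $2^j$ consecutive labels and $A_m$ collects whatever labels remain; the function name is ours.

```python
def killing_blocks(n: int):
    """Return [A_0, A_1, ..., A_m] as lists of walker labels in {1, ..., n}."""
    # m is the smallest non-negative integer with n <= 2^0 + ... + 2^m = 2^(m+1) - 1.
    m = 0
    while n > 2 ** (m + 1) - 1:
        m += 1
    blocks = []
    start = 1
    for j in range(m):
        blocks.append(list(range(start, start + 2 ** j)))  # |A_j| = 2^j
        start += 2 ** j
    blocks.append(list(range(start, n + 1)))  # A_m holds the remaining labels
    return blocks


for block_index, block in enumerate(killing_blocks(10)):
    print(block_index, block)
```

The blocks partition {1, ..., n}, and the last one has at most $2^m$ elements, which is the only size information used about it later.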



We will consider different epochs, numbered backwards in time. It is convenient to have
the following notation.

\[ t_m = 2\,T^Q_{\mathrm{mix}}; \qquad t_j = t_{j+1} + (\ln 5)\,2^{4-j}\,T^Q_{\mathrm{hit}}, \quad j = m-1, m-2, \dots, 0. \tag{9} \]

1. Epoch #∞ is the time interval $[0, t_m)$. We set $A_t = \emptyset$ for all t in this interval, i.e. no
killings are allowed up to time $2\,T^Q_{\mathrm{mix}}$.

2. Epochs #m through #1 correspond to time intervals $I_j = [t_j, t_{j-1})$ as j decreases from
m to 1. For each such j we set:

\[ A_t = A_{j-1} \times \bigcup_{i=j}^{m} A_i, \quad t \in I_j. \]

That is, the only killings allowed are between walkers with labels in $A_{j-1}$ and $A_p$ with
$p \geq j$.

3. Epoch #0 corresponds to the time interval

\[ I_0 = [t_0, +\infty) \]

(the remaining time), where we set $A_t = [n]^2$.
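
We note for later convenience that:

\[ t_0 \leq 2\,T^Q_{\mathrm{mix}} + (\ln 5) \sum_{j \geq 0} 2^{4-j}\,T^Q_{\mathrm{hit}} \leq c\,(T^Q_{\mathrm{mix}} + T^Q_{\mathrm{hit}}) \tag{10} \]

with $c > 0$ universal, since $\sum_j 2^{-j} < +\infty$. We will use this in our application of Proposition 4.1.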
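
The epoch boundaries in (9) and the bound (10) are easy to tabulate. The following sketch takes the mixing and hitting times as plain numeric inputs (hypothetical values, not computed from any chain) and checks that $t_0$ stays below $2\,T_{\mathrm{mix}} + 32 (\ln 5)\,T_{\mathrm{hit}}$, using $\sum_{j \geq 0} 2^{4-j} = 32$.

```python
import math


def epoch_times(m: int, t_mix: float, t_hit: float):
    """Return [t_0, ..., t_m] following recursion (9), filled in backwards from t_m."""
    t = [0.0] * (m + 1)
    t[m] = 2.0 * t_mix
    for j in range(m - 1, -1, -1):
        t[j] = t[j + 1] + math.log(5.0) * 2.0 ** (4 - j) * t_hit
    return t


times = epoch_times(5, t_mix=1.0, t_hit=3.0)
upper = 2.0 * 1.0 + 32.0 * math.log(5.0) * 3.0
# t_0 is the largest boundary and remains below a bound that does not depend on m.
print(times[0], upper)
```

Because the epoch lengths shrink geometrically as j grows, adding more walkers (larger m) only adds short early epochs and cannot push $t_0$ past this bound.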

4.3 Abundance of good walkers 

We have the following simple proposition about the epoch #∞. Intuitively, it says that,
at the end of this epoch, a positive proportion of the random walkers are "good", in that
they have converged to stationarity.

Proposition 4.2 One can construct a (random) subset $R \subset [n]$ such that:

1. R is $\mathcal{H}_{t_m}$-measurable, where $\mathcal{H}_{t_m}$ is the $\sigma$-field generated by $(X^{(n)}_s)_{s \leq t_m}$ and by
some additional independent random variable U.

2. Each $r \in [n]$ belongs to R with probability 1/4, independently of all other $r' \in [n]$.

3. Conditionally on R and on $(X_{t_m}(i))_{i \in [n]\setminus R}$, the vector $(X_{t_m}(r))_{r \in R}$ has i.i.d. coordinates,
each with distribution $\pi$.

Proof: Consider a single $a \in [n]$. Since Q is reversible, Lemma 7 in [2, Chapter 4] shows
that:

\[ \forall a \in [n],\ \forall v \in V : \mathbb{P}\left(X_{2T^Q_{\mathrm{mix}}}(a) = v\right) \geq \frac{\pi(v)}{4}; \]

in other words, for each a there exists some $\nu_a \in M_1(V)$ such that:

\[ \mathbb{P}\left(X_{t_m}(a) = \cdot\right) = \frac{1}{4}\,\pi(\cdot) + \frac{3}{4}\,\nu_a(\cdot). \]

Since the random variables $(X_{t_m}(a))_{a \in [n]}$ are independent, we may assume that they are
sampled as follows:

1. Let $(I(a))_{a \in [n]\setminus A_m}$ be i.i.d. with $\mathbb{P}(I(a) = 1) = 1 - \mathbb{P}(I(a) = 0) = 1/4$.

2. For each a with I(a) = 1, let $X_{t_m}(a)$ be a sample from $\pi$, independent of everything
else.

3. For each b with I(b) = 0, let $X_{t_m}(b)$ be a sample from $\nu_b$, independent of everything
else.

One may check that $R = \{a \in [n]\setminus A_m : I(a) = 1\}$ has the desired properties. □

The next proposition means that, with positive probability, there is a constant proportion
of good walkers within each $A_i$ with $i \leq m-1$.

Proposition 4.3 Let $\mathcal{G}$ be the event:

\[ \mathcal{G} = \bigcap_{i=0}^{m-1} \{|R \cap A_i| \geq 2^{i-3}\}. \]

Then $\mathbb{P}(\mathcal{G}) \geq \alpha > 0$, where

\[ \alpha = \prod_{i \geq 0} \left(1 - e^{-2^{i-7}}\right) > 0 \]

is universal.

Proof: Let Bin(m, x) denote a binomial random variable with parameters m and x, so that:

\[ \mathbb{P}(\mathrm{Bin}(m, x) = k) = \binom{m}{k} x^k (1-x)^{m-k} \quad (k \in [m] \cup \{0\}). \]

The random variables $N_i = |R \cap A_i|$, $0 \leq i \leq m-1$, are independent, and each $N_i$ has the
law of $\mathrm{Bin}(2^i, 1/4)$. Chernoff bounds [3, Appendix A.1] imply:

\[ \mathbb{P}(|R \cap A_i| < 2^{i-3}) = \mathbb{P}\left(\mathrm{Bin}(2^i, 1/4) < \mathbb{E}[\mathrm{Bin}(2^i, 1/4)] - 2^{i-3}\right) \leq e^{-2^{i-7}}. \]

We deduce:

\[ \mathbb{P}(\mathcal{G}) = \prod_{i=0}^{m-1} \mathbb{P}\left(\mathrm{Bin}(2^i, 1/4) \geq 2^{i-3}\right) \geq \alpha. \]

The positivity of $\alpha$ follows from $0 < e^{-2^{i-7}} < 1$ for all i and $\sum_i e^{-2^{i-7}} < +\infty$. □
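
For a sense of scale, the infinite product defining the constant in Proposition 4.3 can be evaluated numerically. The sketch below truncates the product after a fixed number of factors (our choice; the neglected factors are equal to 1 to machine precision) and reads the exponent as $2^{i-7}$, as in the statement above.

```python
import math


def alpha_lower_bound(terms: int = 200) -> float:
    """Approximate prod_{i >= 0} (1 - exp(-2^(i-7))) by its first `terms` factors."""
    product = 1.0
    for i in range(terms):
        # -expm1(-x) equals 1 - exp(-x) and is accurate for small x.
        product *= -math.expm1(-(2.0 ** (i - 7)))
    return product


print(alpha_lower_bound())
```

The result is positive but of order $10^{-9}$, which illustrates that the argument yields universal constants without any attempt to optimize them.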

4.4 The probability of being alive 

Let E(a, t) denote the event:

\[ E(a, t) = \{Y^A_t(a) \neq \partial\} = \{\tau^A_a > t\}. \]

Notice that:

\[ \{|S^A_t| \geq 2\} \subset \bigcup_{a=2}^{n} E(a, t). \tag{11} \]

We will now estimate the conditional probability of E(a, t) given $\mathcal{G}$.

Proposition 4.4 Let $a \in A_i$ for some $1 \leq i \leq m$. Then for all $0 \leq j \leq i$:

\[ \mathbb{P}(E(a, t_j) \mid \mathcal{G}) \leq 5^{j-i}. \]

Proof: We will prove a stronger statement: that for almost all $R_0 \subset [n]$ and $(h_t)_{t \in [t_i, t_j)}$:

\[ \mathbb{P}\left(E(a, t_j) \mid R = R_0,\ (X_t(a))_{t \in [t_i, t_j)} = (h_t)_{t \in [t_i, t_j)}\right) \leq 5^{-\sum_{r=j}^{i-1} 2^{4-r} |R_0 \cap A_r|}. \tag{12} \]

This implies the proposition because the occurrence of $\mathcal{G}$ implies $|R \cap A_r| \geq 2^{r-4}$ for all
$0 \leq r \leq m-1$.

To prove (12) we first observe that the event $E(a, t_j)$ satisfies:

Claim 4.1 Suppose $b \in A_r$ with $j \leq r < i$. Then:

\[ E(a, t_j) \subset \{\forall t \in [t_{r+1}, t_r),\ X_t(a) \neq X_t(b)\}. \]

Proof: [of the Claim] If the event in the RHS does not hold, there exists a $t \in [t_{r+1}, t_r)$
with $X_t(a) = X_t(b)$. We now argue that $\tau^A_a \leq t$ in this case. Indeed, this follows from the
definition of $\tau^A_a$ and the following observations:

1. $X_t(a) = Y^A_t(b)$: this follows from $X_t(b) = Y^A_t(b)$, which is a consequence of the fact
that $(c, b) \notin A_s$ for any c < b and $s < t_r$ (i.e. b cannot be killed before time $t_r$).

2. $(b, a) \in A_t$: this follows from $t \in [t_{r+1}, t_r) = I_{r+1}$.
□

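The Claim implies:

\begin{align*}
&\mathbb{P}\left(E(a, t_j) \mid R = R_0,\ (X_t(a))_{t \in [t_i, t_j)} = (h_t)_{t \in [t_i, t_j)}\right) \\
&\quad \leq \mathbb{P}\left(\bigcap_{r=j}^{i-1} \bigcap_{b \in A_r} \{\forall t \in [t_{r+1}, t_r),\ X_t(a) \neq X_t(b)\} \,\Big|\, R = R_0,\ (X_t(a))_{t \in [t_i, t_j)} = (h_t)_{t \in [t_i, t_j)}\right) \\
&\quad \leq \mathbb{P}\left(\bigcap_{r=j}^{i-1} \bigcap_{b \in A_r \cap R_0} \{\forall t \in [t_{r+1}, t_r),\ X_t(a) \neq X_t(b)\} \,\Big|\, R = R_0,\ (X_t(a))_{t \in [t_i, t_j)} = (h_t)_{t \in [t_i, t_j)}\right).
\end{align*}

Now observe that we are conditioning on $R = R_0$ and on the trajectory of $(X_s(a))_{s \in [t_i, t_j)}$.
Since $a \notin \cup_{r=j}^{i-1} A_r$, Proposition 4.2 implies that:

Under the conditioning, $(X_{t_m}(b) : b \in R_0 \cap (\cup_{r=j}^{i-1} A_r))$ are i.i.d. with common law $\pi$.

Since R is $\mathcal{H}_{t_m}$-measurable, the Markov property for the independent random walks process
implies that

Under the conditioning, $((X_{t+t_m}(b))_{t \geq 0} : b \in R_0 \cap (\cup_{r=j}^{i-1} A_r))$ are i.i.d. realizations of $\mathbb{P}_\pi$.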
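We deduce:

\begin{align*}
\mathbb{P}\left(E(a, t_j) \mid R = R_0,\ (X_t(a))_{t \in [t_i, t_j)} = (h_t)_{t \in [t_i, t_j)}\right)
&\leq \prod_{r=j}^{i-1} \prod_{b \in R_0 \cap A_r} \mathbb{P}_\pi\left(\forall t \in [t_{r+1} - t_m, t_r - t_m),\ X_t \neq h_{t + t_m}\right) \\
(\pi \text{ is stationary})\quad &= \prod_{r=j}^{i-1} \mathbb{P}_\pi\left(\forall t \in [0, t_r - t_{r+1}),\ X_t \neq h_{t + t_{r+1}}\right)^{|R_0 \cap A_r|}.
\end{align*}

We apply the Meeting Time Lemma (Lemma 1.1 above) to each term in the product and
deduce that, for some choice of $(v, q_v)$ as in the Lemma,

\[ \mathbb{P}\left(E(a, t_j) \mid R = R_0,\ (X_t(a))_{t \in [t_i, t_j)} = (h_t)_{t \in [t_i, t_j)}\right) \leq \prod_{r=j}^{i-1} \exp\left(-\frac{(t_r - t_{r+1})\,|R_0 \cap A_r|}{\mathbb{E}_{q_v}[H_v]}\right). \]

The proof of (12) finishes once we realize that $t_r - t_{r+1} = 2^{4-r} (\ln 5)\,T^Q_{\mathrm{hit}}$ and $T^Q_{\mathrm{hit}} \geq \mathbb{E}_{q_v}[H_v]$.

□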

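4.5 End of proof of Theorem 1.1

We now complete the proof of Theorem 1.1. By Proposition 4.1, it suffices to show that:

\[ \mathbb{P}_{x^{(n)}}(|S^A_{t_0}| \geq 2) \leq 1 - \gamma \]

for some universal $\gamma > 0$, with $t_0$ as in (10). To see this, we will use (11) and recall our
convention of omitting $x^{(n)}$ from the notation (cf. Notational convention 1).

\begin{align*}
\mathbb{P}(|S^A_{t_0}| \geq 2) &\leq \mathbb{P}\left(\bigcup_{a=2}^{n} E(a, t_0)\right) \\
(\mathcal{G} \text{ as in Prop. 4.3})\quad &\leq 1 - \mathbb{P}(\mathcal{G}) + \mathbb{P}\left(\mathcal{G} \cap \bigcup_{a=2}^{n} E(a, t_0)\right) \\
\text{(union bound)}\quad &\leq 1 - \mathbb{P}(\mathcal{G}) + \sum_{a=2}^{n} \mathbb{P}(\mathcal{G} \cap E(a, t_0)) \\
([n]\setminus\{1\} = \cup_{i=1}^{m} A_i)\quad &= 1 - \mathbb{P}(\mathcal{G}) + \mathbb{P}(\mathcal{G}) \sum_{i=1}^{m} \sum_{a \in A_i} \mathbb{P}(E(a, t_0) \mid \mathcal{G}) \\
(\text{Prop. 4.4} + |A_i| \leq 2^i)\quad &\leq 1 - \mathbb{P}(\mathcal{G}) + \mathbb{P}(\mathcal{G}) \sum_{i=1}^{+\infty} \left(\frac{2}{5}\right)^{i} \\
&= 1 - \frac{\mathbb{P}(\mathcal{G})}{3}.
\end{align*}

Since $\mathbb{P}(\mathcal{G}) \geq \alpha$ for some universal $\alpha > 0$ (cf. Proposition 4.3), we deduce:

\[ \mathbb{P}_{x^{(n)}}(|S^A_{t_0}| \geq 2) \leq 1 - \gamma \text{ with } \gamma = \frac{\alpha}{3} \text{ universal.} \]

This finishes the proof.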

4.6 Proof of Theorem [TT21 

We now present the modifications of the previous proof that are necessary to prove The- 
orem ll.2i We keep the definitions from previous subsections. We will also assume that 
k > 4, so that there exists some j 6 [m] with: 



in fact, we will assume that j is the largest number satisfying this, so that 2- ?+2 > k/2. 
(The case of k < 4 follows from Theorem II. 1\ with an increase in the universal constant if 
necessary. If m is too small to allow for this choice of j, we may increase n - and thus m - 
at the cost of having more walkers in the beginning of the process.) 
We first need an analogue of Proposition 14.11 

Proposition 4.5 (Proof omitted) Suppose that there exists a universal 7 > such that 
for all k as above, all n G N and all G V n ; 



We omit the proof of this, which follows that of Proposition 14.11 quite closely. The key 
point is to notice that: 



h = 2^ +1 



l = l + 2 + -- - + 2 J < k/2; 




(13) 



Then there exists a universal K\ > with: 




h = ZTgfc + £ 2 4 "* (In 5)Tg t < 2T« x + c x 2 




iQ 

hit 



i=j 



k 
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with cx,C2 > 1 universal (here we used 2 J+2 > k/2). 



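We will now bound $P(|S_{t_j}| \ge k+1)$ in terms of $h$ and $k$. Using our notational
convention, we first observe that, since $h \le k$:

$$P(|S_{t_j}| \ge k+1) \le P\left(\sum_{a=1}^{n} 1_{E(a,t_j)} \ge k+1\right) \le P\left(\sum_{a=h+1}^{n} 1_{E(a,t_j)} \ge k-h+1\right).$$

Now follow the long chain of inequalities in the previous subsection to deduce:

$$\begin{aligned}
P(|S_{t_j}| \ge k+1) &\le 1 - P(\mathcal{G}) + P(\mathcal{G})\,P\left(\sum_{a=h+1}^{n} 1_{E(a,t_j)} \ge k-h+1 \,\Big|\, \mathcal{G}\right)\\
(\text{Markov ineq.})\quad &\le 1 - P(\mathcal{G}) + P(\mathcal{G})\,\frac{E\left[\sum_{a=h+1}^{n} 1_{E(a,t_j)} \,\big|\, \mathcal{G}\right]}{k-h+1}\\
([n]\setminus[h] = \cup_{i=j+1}^{m} A_i)\quad &\le 1 - P(\mathcal{G}) + P(\mathcal{G})\,\frac{E\left[\sum_{i=j+1}^{m}\sum_{a\in A_i} 1_{E(a,t_j)} \,\big|\, \mathcal{G}\right]}{k-h+1}\\
&= 1 - P(\mathcal{G}) + P(\mathcal{G})\sum_{i=j+1}^{m}\sum_{a\in A_i}\frac{P(E(a,t_j)\mid\mathcal{G})}{k-h+1}\\
(\text{Prop. for } P(E(a,t_j)\mid\mathcal{G}) \;+\; |A_i| \le 2^i)\quad &\le 1 - P(\mathcal{G}) + \frac{P(\mathcal{G})}{k-h+1}\sum_{i=j+1}^{+\infty} 2^{i}\left(\frac{1}{5}\right)^{i-j}\\
&= 1 - P(\mathcal{G}) + P(\mathcal{G})\,\frac{2^{j+1}}{3(k-h+1)}\\
(2^{j+1} = h+1)\quad &\le 1 - P(\mathcal{G}) + P(\mathcal{G})\,\frac{h+1}{3(k-h+1)}\\
(h \le k/2)\quad &\le 1 - P(\mathcal{G}) + P(\mathcal{G})\,\frac{k+2}{6(k-k/2)}\\
&= 1 - P(\mathcal{G}) + P(\mathcal{G})\,\frac{k+2}{3k}\\
(k > 4)\quad &\le 1 - \frac{8}{15}\,P(\mathcal{G}).
\end{aligned}$$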

To finish, we note that $P(\mathcal{G}) \ge \alpha > 0$ with $\alpha$ universal (Proposition 4.3); hence we may
take $\gamma = 8\alpha/15$ in Proposition 4.5.
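The arithmetic behind the constant 8/15 can be checked mechanically. The sketch below is an illustration only, not part of the original argument. It uses exact rational arithmetic, takes $j \ge 0$ to be the largest integer with $h = 2^{j+1} - 1 \le k/2$ (allowing $j = 0$ is an assumption of this sketch), and assumes, as the factor $\ln 5$ in $t_j$ suggests, a geometric decay $5^{-(i-j)}$ for the per-walker bound on block $A_i$. All function names are hypothetical.

```python
from fractions import Fraction


def largest_j(k):
    """Largest j >= 0 with h = 2**(j+1) - 1 <= k/2 (assumes k >= 2)."""
    j = 0
    while 2 ** (j + 2) - 1 <= Fraction(k, 2):
        j += 1
    return j


def geometric_tail(j, terms=200):
    """Partial sum of sum_{i > j} 2**i * 5**(j - i); the full series is 2**(j+1) / 3."""
    return sum(Fraction(2 ** i, 5 ** (i - j)) for i in range(j + 1, j + 1 + terms))


def final_bound(k):
    """Return (j, h, ratio) with ratio = (h + 1) / (3 (k - h + 1))."""
    j = largest_j(k)
    h = 2 ** (j + 1) - 1
    # Maximality of j gives 2**(j+2) > k/2.
    assert h <= Fraction(k, 2) < 2 ** (j + 2)
    ratio = Fraction(h + 1, 3 * (k - h + 1))
    # h <= k/2 and k > 4 give the last two steps of the chain.
    assert ratio <= Fraction(k + 2, 3 * k) <= Fraction(7, 15)
    return j, h, ratio


if __name__ == "__main__":
    for k in range(5, 500):
        final_bound(k)
    print(final_bound(5), final_bound(100))
```

The value $7/15$ is attained by $(k+2)/(3k)$ exactly at $k = 5$, which is why the assumption $k > 4$ yields $1 - \frac{8}{15}P(\mathcal{G})$.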
5 On the Meeting Time Lemma

5.1 Preliminaries on quasistationary distributions 

In this section we review some facts about quasistationary distributions that will be needed
in the proof of Lemma 1.1. We will use the definitions of Section 2.2 throughout the section.

Given any $v \in V$, we let $q_v$ be a quasistationary distribution for $V\setminus\{v\}$: that is,
$q_v \in M_1(V)$ satisfies

$$\forall b \in V, \quad q_v(b) = P_{q_v}(X_t = b \mid H_v > t).$$

All quasistationary distributions $q_v$ correspond to eigenvalues of the restriction of $\Pi^{1/2} Q\, \Pi^{-1/2}$
to a subspace $\mathbb{R}^V_{-v}$ of $\mathbb{R}^V$ defined below. Here is the recipe.

1. Consider the subspace:

$$\mathbb{R}^V_{-v} = \{u \in \mathbb{R}^V : u(v) = 0\}$$

and let $\mathcal{P}_{-v} : \mathbb{R}^V \to \mathbb{R}^V_{-v}$ denote the standard projection onto $\mathbb{R}^V_{-v}$. Then

$$Q_{-v} = \mathcal{P}_{-v}\,\Pi^{1/2} Q\, \Pi^{-1/2}\,\mathcal{P}_{-v}$$

is a symmetric linear operator from $\mathbb{R}^V_{-v}$ to itself with identical diagonal entries and
non-positive off-diagonal entries in the "obvious" basis for that space, i.e. the one
given by the canonical basis vectors $e_b$, $b \in V\setminus\{v\}$.

2. By Perron-Frobenius, each irreducible block of the matrix $Q_{-v}$ has a unique eigenvector
$w_v \in \mathbb{R}^V_{-v}\setminus\{0\}$ with non-negative entries which achieves the smallest eigenvalue
$\lambda(w_v)$ corresponding to that block.

3. A simple calculation shows that the vector:

$$q_v = \frac{\Pi^{1/2} w_v}{\mathbf{1}^\dagger\, \Pi^{1/2} w_v}$$

defines a probability distribution over $V$ with:

$$P_{q_v}(X_t = b, H_v > t) = q_v^\dagger\, e^{-t\mathcal{P}_{-v} Q \mathcal{P}_{-v}}\, e_b = e^{-\lambda(w_v) t}\, q_v(b),$$

which in particular implies that $q_v$ is a quasistationary distribution associated with
$v$. In particular, $E_{q_v}[H_v] = 1/\lambda(w_v) > 0$. Notice moreover that $q_v(v) = 0$.

The following proposition - an immediate consequence of the third item above - will 
be all we need. 

Proposition 5.1 Let $Q_{-v}$ be defined as above and let $\lambda(v)$ denote the smallest eigenvalue
of $Q_{-v}$. Then there exists a quasistationary distribution $q_v$ for $V\setminus\{v\}$ such that
$\lambda(v) = 1/E_{q_v}[H_v]$ and:

$$P_{q_v}(H_v > t) = e^{-\lambda(v) t}.$$

Proof: This smallest eigenvalue is the smallest eigenvalue of some block of $Q_{-v}$, and
thus equals $\lambda(w_v)$ for some $w_v$. The rest follows from item 3 and from summing the formula for
$P_{q_v}(X_t = b, H_v > t)$ over $b$. □
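As a concrete illustration of the recipe and of Proposition 5.1, the following sketch computes $\lambda(v)$ and $q_v$ for a small birth-death chain and checks numerically that $P_{q_v}(H_v > t) = e^{-\lambda(v)t}$. The chain, its rates and all function names are hypothetical choices made for this example. The sketch also assumes the convention, suggested by the formulas of this section, that the transition semigroup is $e^{-tQ}$, so that $Q$ is minus the usual rate matrix. It is a sanity check, not part of the proof.

```python
import math


def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def mat_exp(M, terms=60):
    """Truncated Taylor series for exp(M); adequate for the tiny matrices used here."""
    n = len(M)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, M)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


# Hypothetical example: birth-death chain on {0, 1, 2}, reversible w.r.t. pi.
pi = [0.5, 0.25, 0.25]
Q = [[1.0, -1.0, 0.0],
     [-2.0, 3.0, -1.0],
     [0.0, -1.0, 1.0]]      # minus the usual rate matrix (assumed convention)
v, keep = 2, [0, 1]

# Q_{-v}: the symmetrized operator restricted to the coordinates other than v.
S = [[math.sqrt(pi[r] / pi[c]) * Q[r][c] for c in keep] for r in keep]
a, b, d = S[0][0], S[0][1], S[1][1]
lam = 0.5 * (a + d - math.sqrt((a - d) ** 2 + 4.0 * b * b))   # smallest eigenvalue
w = [-b, a - lam]                                              # eigenvector for lam

# q_v is proportional to Pi^{1/2} w_v, normalized to a probability vector.
unnorm = [math.sqrt(pi[s]) * w[i] for i, s in enumerate(keep)]
q = [x / sum(unnorm) for x in unnorm]


def survival(t):
    """P_q(H_v > t): total mass of q^T exp(-t Q_r), with Q_r = Q minus row/column v."""
    M = [[-t * Q[r][c] for c in keep] for r in keep]
    return sum(mat_mul([q], mat_exp(M))[0])


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0):
        print(t, survival(t), math.exp(-lam * t))
```

In this example $Q_{-v}$ is a single irreducible block, so $\lambda(v) = \lambda(w_v)$ and the two printed columns should agree up to rounding.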

5.2 Proof of the Meeting Time Lemma 

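Proof: [of Lemma 1.1] Fix $n \in \mathbb{N}\setminus\{0\}$, $0 < A < n$. We note that:

$$P_\pi(\forall\, 0 \le s \le t,\ X_s \ne h_s) \le P_\pi\left(\bigcap_{i=1}^{n} \{X(it/n) \ne h(it/n)\}\right) \le E_\pi\left[\prod_{i=1}^{n}\left(1 - \frac{A}{n}\,1_{\{X(it/n) = h(it/n)\}}\right)\right].$$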



For a given $v \in V$, let $D_v$ be the matrix with a 1 in position $(v,v)$ and 0s elsewhere. A
calculation reveals that the RHS above can be rewritten as:

$$(\Pi\mathbf{1})^\dagger\, e^{-\frac{tQ}{n}}\left(I - \frac{A D_{h(t/n)}}{n}\right) e^{-\frac{tQ}{n}}\left(I - \frac{A D_{h(2t/n)}}{n}\right) \cdots\, e^{-\frac{tQ}{n}}\left(I - \frac{A D_{h(t)}}{n}\right)\mathbf{1},$$

where $\mathbf{1}$ is the all-ones vector and $\Pi = \mathrm{diag}(\pi(u))_{u \in V}$ was introduced in Section 5.1. Since
$\Pi$ commutes with all $D_v$, we can rewrite the above expression as:

$$(\Pi^{1/2}\mathbf{1})^\dagger \left[\prod^{*}_{1 \le i \le n} e^{-\frac{t\Pi^{1/2}Q\Pi^{-1/2}}{n}}\left(I - \frac{A D_{h(it/n)}}{n}\right)\right] (\Pi^{1/2}\mathbf{1}),$$

where the $\prod^{*}$ symbol means that the order of the terms in the product is from left to right.
The vector $\Pi^{1/2}\mathbf{1}$ has norm $|\Pi^{1/2}\mathbf{1}|^2 = \sum_v \pi(v) = 1$. This implies that the above
expression is at most the operator norm of the product of matrices. It follows that:



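$$P_\pi(\forall\, 0 \le s \le t,\ X_s \ne h_s) \le \left\|\prod^{*}_{1 \le i \le n} e^{-\frac{t\Pi^{1/2}Q\Pi^{-1/2}}{n}}\left(I - \frac{A D_{h(it/n)}}{n}\right)\right\|_{\rm op}.$$

Since the operator norm is submultiplicative, we obtain:

$$P_\pi(\forall\, 0 \le s \le t,\ X_s \ne h_s) \le \prod_{i=1}^{n}\left\|e^{-\frac{t\Pi^{1/2}Q\Pi^{-1/2}}{n}}\left(I - \frac{A D_{h(it/n)}}{n}\right)\right\|_{\rm op} \le \left(\max_{v \in V}\left\|e^{-\frac{t\Pi^{1/2}Q\Pi^{-1/2}}{n}}\left(I - \frac{A D_v}{n}\right)\right\|_{\rm op}\right)^{n}. \qquad (14)$$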



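We now consider the terms of which we take the maximum in the RHS, for large $n \in \mathbb{N}$.
For a given $v \in V$, we have:

$$\left\|e^{-\frac{t\Pi^{1/2}Q\Pi^{-1/2}}{n}}\left(I - \frac{A D_v}{n}\right) - e^{\frac{-t\Pi^{1/2}Q\Pi^{-1/2} - A D_v}{n}}\right\|_{\rm op} = O(n^{-2}),$$

where the constant implicit in the $O(n^{-2})$ term depends only on $A$, $t$ and $Q$ (and not on
$v$, say). Letting $n \to +\infty$ while keeping $A$ fixed, we get:

$$\begin{aligned}
\lim_{n \to +\infty}\left(\max_{v \in V}\left\|e^{-\frac{t\Pi^{1/2}Q\Pi^{-1/2}}{n}}\left(I - \frac{A D_v}{n}\right)\right\|_{\rm op}\right)^{n}
&= \lim_{n \to +\infty}\left(\max_{v \in V}\left\|e^{\frac{-t\Pi^{1/2}Q\Pi^{-1/2} - A D_v}{n}}\right\|_{\rm op}\right)^{n}\\
&= \max_{v \in V}\left\|e^{-t\Pi^{1/2}Q\Pi^{-1/2} - A D_v}\right\|_{\rm op}. \qquad (15)
\end{aligned}$$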



Indeed, the last line follows from the self-adjointness of the exponential and from the fact
that $\|B^k\|_{\rm op} = \|B\|^k_{\rm op}$ for self-adjoint matrices $B$. We now use the positive-definiteness of
matrix exponentials, together with the spectral mapping property, to deduce:

$$\forall v \in V, \quad \left\|e^{-t\Pi^{1/2}Q\Pi^{-1/2} - A D_v}\right\|_{\rm op} = \lambda_{\max}\left(e^{-t\Pi^{1/2}Q\Pi^{-1/2} - A D_v}\right) = e^{-\lambda_{\min}(t\Pi^{1/2}Q\Pi^{-1/2} + A D_v)}.$$

This implies that, for every $A > 0$:

$$P_\pi(\forall\, 0 \le s \le t,\ X_s \ne h_s) \le \exp\left(-\min_{v \in V}\lambda_{\min}\left(t\Pi^{1/2}Q\Pi^{-1/2} + A D_v\right)\right).$$
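The passage from (14) to the exponential bound can be observed numerically. The sketch below is an illustration under stated assumptions, not part of the proof: it fixes a hypothetical two-state chain and a single state $v$ (no maximum over $v$ is taken), computes $\|e^{-t\Pi^{1/2}Q\Pi^{-1/2}/n}(I - AD_v/n)\|_{\rm op}^n$ for increasing $n$, and compares it with $\|e^{-t\Pi^{1/2}Q\Pi^{-1/2} - AD_v}\|_{\rm op}$ and with $e^{-\lambda_{\min}(t\Pi^{1/2}Q\Pi^{-1/2} + AD_v)}$. All names and parameter values are choices made for this example.

```python
import math


def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def mat_exp(M, terms=60):
    """Truncated Taylor series for exp(M); adequate for the tiny matrices used here."""
    n = len(M)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, M)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def op_norm_2x2(M):
    """Largest singular value of a 2x2 matrix, via the eigenvalues of M^T M."""
    g11 = M[0][0] ** 2 + M[1][0] ** 2
    g22 = M[0][1] ** 2 + M[1][1] ** 2
    g12 = M[0][0] * M[0][1] + M[1][0] * M[1][1]
    return math.sqrt(0.5 * (g11 + g22 + math.sqrt((g11 - g22) ** 2 + 4.0 * g12 ** 2)))


# Hypothetical two-state chain: Q = [[1, -1], [-2, 2]], pi = (2/3, 1/3), v = 1.
t, A = 1.0, 3.0
r = math.sqrt(2.0)
S = [[1.0, -r], [-r, 2.0]]      # Pi^{1/2} Q Pi^{-1/2}
D = [[0.0, 0.0], [0.0, 1.0]]    # D_v for v = 1


def discrete_side(n):
    """|| exp(-t S / n) (I - A D_v / n) ||_op ** n, the quantity inside the first limit of (15)."""
    step = mat_exp([[-t * S[i][j] / n for j in range(2)] for i in range(2)])
    damp = [[float(i == j) - A * D[i][j] / n for j in range(2)] for i in range(2)]
    return op_norm_2x2(mat_mul(step, damp)) ** n


def limit_side():
    """|| exp(-t S - A D_v) ||_op, the last term of (15) for this single v."""
    return op_norm_2x2(mat_exp([[-t * S[i][j] - A * D[i][j] for j in range(2)]
                                for i in range(2)]))


def closed_form():
    """exp(-lambda_min(t S + A D_v)), using the 2x2 symmetric eigenvalue formula."""
    a, b, d = t * S[0][0], t * S[0][1], t * S[1][1] + A
    return math.exp(-0.5 * (a + d - math.sqrt((a - d) ** 2 + 4.0 * b * b)))


if __name__ == "__main__":
    for n in (20, 200, 2000):
        print(n, discrete_side(n), limit_side(), closed_form())
```

The gap between the discrete quantity and its limit should shrink roughly like $1/n$, in line with the $O(n^{-2})$ per-factor error.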



We now make the following Claim.

Claim 5.1 As $A \nearrow +\infty$,

$$\lambda_{\min}\left(t\Pi^{1/2}Q\Pi^{-1/2} + A D_v\right) \to t\,\lambda_{\min}(Q_{-v}),$$

where $Q_{-v}$ is defined as in Section 5.1.

This result is probably well-known; for instance, it is a weaker variant of Lemma 3.1 in [5].
We will prove it below for completeness, but first we deduce from it that:

$$P_\pi(\forall\, 0 \le s \le t,\ X_s \ne h_s) \le \exp\left(-t\min_{v \in V}\lambda_{\min}(Q_{-v})\right) = e^{-t\lambda(v)} = P_{q_v}(H_v > t)$$

via Proposition 5.1, where $v$ attains the minimum and $q_v$ is some quasistationary distribution associated with $v$.

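We now prove the Claim. Recall the definition of $\mathcal{P}_{-v}$ in Section 5.1 and notice that
$D_v = I - \mathcal{P}_{-v}$. This shows that $D_v w = 0$ for all $w \in \mathbb{R}^V_{-v}$ and therefore:

$$\begin{aligned}
\lambda_{\min}\left(t\Pi^{1/2}Q\Pi^{-1/2} + A D_v\right) &= \inf_{w \in \mathbb{R}^V,\ |w| = 1} w^\dagger\left(t\Pi^{1/2}Q\Pi^{-1/2} + A D_v\right)w\\
&\le \inf_{w \in \mathbb{R}^V_{-v},\ |w| = 1} w^\dagger\left(t\Pi^{1/2}Q\Pi^{-1/2} + A D_v\right)w\\
&= \inf_{w \in \mathbb{R}^V_{-v},\ |w| = 1} w^\dagger\, t\Pi^{1/2}Q\Pi^{-1/2}\, w\\
(\text{use } \mathcal{P}_{-v}w = w)\quad &= t\inf_{w \in \mathbb{R}^V_{-v},\ |w| = 1} w^\dagger\left(\mathcal{P}_{-v}\Pi^{1/2}Q\Pi^{-1/2}\mathcal{P}_{-v}\right)w\\
&= t\,\lambda_{\min}(Q_{-v}). \qquad (16)
\end{aligned}$$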

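To get an opposite inequality, we set $\mathcal{A} = t\Pi^{1/2}Q\Pi^{-1/2}$ for convenience. We first show that
there exists some $c > 0$ such that for all large enough $A > 0$,

$$\mathcal{A} + A D_v \succeq \mathcal{P}_{-v}\mathcal{A}\mathcal{P}_{-v} - \frac{c\,\mathcal{P}_{-v}}{A} + \frac{A}{2} D_v, \qquad (17)$$

where for symmetric matrices $B_1, B_2$ with the same size, $B_1 \preceq B_2$ means that $B_2 - B_1$ is
positive semidefinite. To see this, we use $\mathcal{P}_{-v} + D_v = I$ several times and notice that for
any $x \in \mathbb{R}^V$,

$$\begin{aligned}
x^\dagger(\mathcal{A} + A D_v)x &= x^\dagger\mathcal{P}_{-v}\mathcal{A}\mathcal{P}_{-v}x + x^\dagger D_v\mathcal{A}D_v x + 2x^\dagger(\mathcal{P}_{-v}\mathcal{A}D_v)x + A(x^\dagger D_v x)\\
(\text{Cauchy-Schwarz})\quad &\ge x^\dagger\mathcal{P}_{-v}\mathcal{A}\mathcal{P}_{-v}x + x^\dagger\frac{A D_v}{2}x + |D_v x|^2\left(\frac{A}{2} - \|\mathcal{A}\|_{\rm op} - \frac{2\|\mathcal{A}\|_{\rm op}\,|\mathcal{P}_{-v}x|}{|D_v x|}\right)\\
(\text{assume } A \ge 4\|\mathcal{A}\|_{\rm op})\quad &\ge x^\dagger\left(\mathcal{P}_{-v}\mathcal{A}\mathcal{P}_{-v} + \frac{A D_v}{2}\right)x + \left(\frac{\sqrt{A}\,|D_v x|}{2} - \frac{2\|\mathcal{A}\|_{\rm op}\,|\mathcal{P}_{-v}x|}{\sqrt{A}}\right)^{2} - \frac{4\|\mathcal{A}\|^2_{\rm op}\,|\mathcal{P}_{-v}x|^2}{A}\\
(\text{set } c = 4\|\mathcal{A}\|^2_{\rm op})\quad &\ge x^\dagger\left(\mathcal{P}_{-v}\mathcal{A}\mathcal{P}_{-v} - \frac{c\,\mathcal{P}_{-v}}{A} + \frac{A D_v}{2}\right)x.
\end{aligned}$$

This proves (17), which implies:

$$\lambda_{\min}(\mathcal{A} + A D_v) \ge \lambda_{\min}\left(\mathcal{P}_{-v}\mathcal{A}\mathcal{P}_{-v} - \frac{c\,\mathcal{P}_{-v}}{A} + \frac{A}{2} D_v\right). \qquad (18)$$

Notice that the matrix in the RHS has $\mathbb{R}^V_{-v}$ as an invariant subspace, which implies that
all of its eigenvectors lie in $\mathbb{R}^V_{-v}$ or in its orthogonal complement. It is easy to see that
all vectors in the latter space are eigenvectors with eigenvalue $A/2$; therefore, for all large
enough $A$ the minimal eigenvalue corresponds to a vector in $\mathbb{R}^V_{-v}$. We deduce that for all
large $A > 0$,

$$\lambda_{\min}\left(\mathcal{P}_{-v}\mathcal{A}\mathcal{P}_{-v} - \frac{c\,\mathcal{P}_{-v}}{A} + \frac{A}{2} D_v\right) = \inf_{w \in \mathbb{R}^V_{-v},\ |w| = 1} w^\dagger\left(\mathcal{P}_{-v}\mathcal{A}\mathcal{P}_{-v} - \frac{c\,\mathcal{P}_{-v}}{A}\right)w = t\,\lambda_{\min}(Q_{-v}) - \frac{c}{A},$$

because $w^\dagger\mathcal{P}_{-v}\mathcal{A}\mathcal{P}_{-v}w = t\,w^\dagger Q_{-v} w$ for all $w$ as above. Together with (16) and (18), this
shows that:

$$\text{for large enough } A > 0, \quad t\,\lambda_{\min}(Q_{-v}) - \frac{c}{A} \le \lambda_{\min}\left(t\Pi^{1/2}Q\Pi^{-1/2} + A D_v\right) \le t\,\lambda_{\min}(Q_{-v}),$$

and the Claim follows when we let $A \nearrow +\infty$. □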
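The two-sided bound at the end of this proof is easy to observe on a small example. The sketch below is an illustration under stated assumptions, not part of the proof: it uses a hypothetical two-state chain with $v = 1$, for which $Q_{-v}$ reduces to a single number, takes $c = 4\|t\Pi^{1/2}Q\Pi^{-1/2}\|^2_{\rm op}$ as in the computation leading to (17), and checks both the semidefinite inequality (17) and the sandwich for several values of $A \ge 4\|t\Pi^{1/2}Q\Pi^{-1/2}\|_{\rm op}$. All names and parameter values are choices made for this example.

```python
import math


def lambda_min_sym2(a, b, d):
    """Smallest eigenvalue of the symmetric matrix [[a, b], [b, d]]."""
    return 0.5 * (a + d - math.sqrt((a - d) ** 2 + 4.0 * b * b))


# Hypothetical two-state chain: Q = [[1, -1], [-2, 2]], pi = (2/3, 1/3), v = 1, t = 1.
t = 1.0
S = [[1.0, -math.sqrt(2.0)], [-math.sqrt(2.0), 2.0]]   # Pi^{1/2} Q Pi^{-1/2}
target = t * S[0][0]           # t * lambda_min(Q_{-v}); Q_{-v} is the 1x1 block [S_00]
op = t * (S[0][0] + S[1][1])   # operator norm of t S: S is PSD of rank one, so norm = trace
c = 4.0 * op ** 2              # c = 4 ||t S||_op^2, as in the proof of (17)


def penalized(A):
    """lambda_min(t S + A D_v) for v = 1."""
    return lambda_min_sym2(t * S[0][0], t * S[0][1], t * S[1][1] + A)


def gap_17(A):
    """Smallest eigenvalue of (t S + A D_v) - (P t S P - c P / A + (A / 2) D_v).

    Inequality (17) states that this difference is positive semidefinite.
    """
    return lambda_min_sym2(c / A, t * S[0][1], t * S[1][1] + A / 2.0)


if __name__ == "__main__":
    for A in (12.0, 50.0, 1000.0, 1.0e6):
        print(A, target - c / A, penalized(A), target, gap_17(A))
```

As $A$ grows, the middle quantity should increase towards $t\,\lambda_{\min}(Q_{-v})$ while staying above the lower bound $t\,\lambda_{\min}(Q_{-v}) - c/A$.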

6 Final remarks 

• Let $T^{Q}_{\rm meet}$ denote the maximum expected meeting time of two independent realizations
of $Q$. In light of the discussion in the Introduction, it would be natural to expect that
$E[C] \le K_2\, T^{Q}_{\rm meet}$ for some universal $K_2 > 0$ and all $Q$. Is this actually true? A more
modest question is whether the constants in the two Theorems can be improved.

• The Meeting Time Lemma (Lemma 1.1) can be used in the study of a cat-and-mouse
game proposed in [2, Chapter 4, page 17]. In this game a cat moves according to a
reversible Markov chain $Q$. A mouse chooses a trajectory $(h_s)_{s \ge 0}$ for itself and an
initial distribution for the cat. Aldous and Fill asked if staying put at some carefully
chosen state gives an optimal strategy for the mouse in terms of maximizing $E[M]$,
where $M$ is the meeting time of cat and mouse. One can use Lemma 1.1 to prove that
if $T^{Q}_{\rm mix} \ll \max_v E_\pi[H_v]$ (a natural condition in many examples), then the strategy
where the mouse stays at $v$ and chooses $q_v$ as the initial distribution nearly maximizes
$P(M > t)$ simultaneously for all $t \ge 0$. We expect to comment on this and related
results in an upcoming note.